Privacy-preserving Classification of Data Streams
Author
Abstract
Data mining is the information technology that extracts valuable knowledge from large amounts of data. With the emergence of data streams as a new type of data, data stream mining has recently become an important and popular research topic, and many studies have proposed efficient mining algorithms for data streams. However, data mining can also pose a serious threat to data privacy, which is why privacy-preserving data mining has been studied as well. In this paper, we propose a method for privacy-preserving classification of data streams, called the PCDS method, which extends the data stream classification process to achieve privacy preservation. The PCDS method is divided into two stages: data stream preprocessing and data stream mining. The preprocessing stage uses the data splitting and perturbation algorithm to perturb confidential data. Users can flexibly choose which data attributes to perturb according to their security needs, which reduces the threats and risks of releasing data. The mining stage uses the weighted average sliding window algorithm to mine the perturbed data streams. When the classification error rate exceeds a predetermined threshold, the classification model is reconstructed to maintain classification accuracy. Experimental results show that the PCDS method not only preserves data privacy but also mines data streams accurately.
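The two stages described in the abstract can be illustrated with a minimal sketch. This is not the PCDS implementation: the abstract does not give the details of the data splitting and perturbation algorithm or of the weighted average sliding window algorithm, so the sketch assumes simple additive Gaussian noise on user-selected attributes and a linearly weighted average of recent classification errors. The names `perturb`, `WeightedWindowMonitor`, `scale`, `size`, and `threshold` are hypothetical.

```python
import random
from collections import deque


def perturb(record, confidential, scale, rng):
    """Return a copy of `record` with noise added to confidential attributes.

    Assumption: zero-mean Gaussian noise stands in for the paper's
    data splitting and perturbation algorithm, which is not specified here.
    """
    return {
        key: (value + rng.gauss(0.0, scale) if key in confidential else value)
        for key, value in record.items()
    }


class WeightedWindowMonitor:
    """Track recent classification errors in a fixed-size sliding window.

    Assumption: newer outcomes get linearly larger weights (1, 2, ..., n).
    """

    def __init__(self, size, threshold):
        self.errors = deque(maxlen=size)  # oldest outcome is dropped first
        self.threshold = threshold

    def update(self, was_error):
        """Record whether the latest prediction was wrong."""
        self.errors.append(1.0 if was_error else 0.0)

    def weighted_error(self):
        """Weighted average error rate over the current window."""
        if not self.errors:
            return 0.0
        weights = range(1, len(self.errors) + 1)
        total = sum(w * e for w, e in zip(weights, self.errors))
        return total / sum(weights)

    def needs_rebuild(self):
        """True when the weighted error rate exceeds the threshold."""
        return self.weighted_error() > self.threshold


if __name__ == "__main__":
    rng = random.Random(0)
    noisy = perturb({"age": 30.0, "zip": 12345}, {"age"}, scale=2.0, rng=rng)
    print(noisy)  # only "age" is perturbed

    monitor = WeightedWindowMonitor(size=4, threshold=0.5)
    for was_error in (False, False, True, True):
        monitor.update(was_error)
    print(monitor.weighted_error(), monitor.needs_rebuild())
```

In this sketch, a caller would retrain the classifier on recent perturbed records whenever `needs_rebuild()` returns true. Weighting recent errors more heavily makes the monitor react faster to a change in the stream than a plain average over the same window.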
Similar resources
Privacy Preserving Data Stream Classification Using Data Perturbation Techniques
A data stream can be conceived as a continuous and changing sequence of data that continuously arrives at a system to be stored or processed. Examples of data streams include computer network traffic, phone conversations, web searches, and sensor data. These data sets need to be analyzed to identify trends and patterns, which help us in isolating anomalies and predicting future behavior. However,...
Privacy-Preserving Data Stream Classification
In a wide range of applications, multiple data streams need to be examined together in order to discover trends or patterns existing across several data streams. One common practice is to redirect all data streams into a central place for joint analysis. This “centralized” practice is challenged by the fact that data streams often are private in that they come from different owners. In this pap...
Privacy-Preserving Classification for Data Streams
In a wide range of applications, multiple data streams need to be examined together in order to discover trends or patterns existing across several data streams. One common practice is to redirect all data streams into a central place for joint analysis. This “centralized” practice is challenged by the fact that data streams often are private in that they come from different owners. In this pap...
Privacy-Preserving Distributed Stream Monitoring
Applications such as sensor network monitoring, distributed intrusion detection, and real-time analysis of financial data necessitate the processing of distributed data streams on the fly. While efficient data processing algorithms enable such applications, they require access to large amounts of often personal information, and could consequently create privacy risks. Previous works have studie...
Privacy-Preserving Distributed Stream Monitoring (NDSS 2014)
Applications such as sensor network monitoring, distributed intrusion detection, and real-time analysis of financial data necessitate the processing of distributed data streams on the fly. While efficient data processing algorithms enable such applications, they require access to large amounts of often personal information, and could consequently create privacy risks. Previous works have studie...
Journal title:
Volume, issue
Pages -
Publication date: 2008